-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix!: store/cachekv: reduce growth factor for iterator ranging using binary searches #10024
fix!: store/cachekv: reduce growth factor for iterator ranging using binary searches #10024
Conversation
Codecov Report
@@ Coverage Diff @@
## master #10024 +/- ##
==========================================
- Coverage 64.24% 63.61% -0.64%
==========================================
Files 572 572
Lines 54158 53657 -501
==========================================
- Hits 34794 34132 -662
- Misses 17384 17582 +198
+ Partials 1980 1943 -37
|
Alrighty, all tests pass. Please help me take a look folks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can a changelog entry be added otherwise LGTM.
@odeke-em can you merge master, then the bot will merge this PR |
Thank you @marbar3778! I've added a CHANGELOG.md entry as well as rebased from the master branch. Thank you @marbar3778 and @robert-zaremba for the code reviews! |
… searches This change takes the observation that previous dbm.IsKeyInDomain which searches for [start, end) was performing too many byteslice comparisons. Instead we start off by sorting all the values in the store.unsortedCache, and then apply a modified binary search to look for values that fall within the domain [start, end) The procedure involves: * iterating over all items to build a list of all keys -- O(n) * invoking sort.Strings immediately, of which we anyways eventually invoke sort.Slice(unsorted, ...) which uses Quicksort -- O(nlog(n)) or O(n^2) worst case * invoking modified binary search which is O(log(n)) * 2 ~ O(log(n)) to search for the [start, end) range indices for a total approximate complexity of: Best case: O(n) + O(n(log(n))) + O(log(n)) ~= O(nlog(n)) Worst case: O(n) + O(n^2) + O(log(n)) ~= O(n^2) instead of previously: * iterating over all the unsorted items and invoking dbm.IsKeyInDomain: bytes.Compare ~ O(n) + O(n*s*e) where s -- len(start), e -- len(end) for overall complexity of O(n*s*e) * invoking sort.Slice(unsorted, ...) which uses Quicksort -- O(nlog(n)) or O(n^2) worst case for a total approximate complexity of: Best case: O(n) + O(n*s*e) + O(nlog(n)) ~= O(n*s*e) ~ O(n^2) Worst case: O(n) + O(n*s*e) + O(n^2) ~= O(n*s*e) ~ O(n^2) Ordinarily we'd combine the n*s*e to be n*m, but really the comparisons between (start & key, end & key) are profound that it makes sense to keep them as factors. The overall benchmark results vindicate our choice of isolating the factors (n*s*e) The benchmarks show that as the number of keys to iterate grows, the new code grows gracefully in a somewhat linear growth, notice for CAcheKVStoreIterator*, when we go from: * 1,000 to 10,000 keys: 120us->1,600us (13X) old vs 95us->900us (9.47X) new * 50,000 to 100,000 keys: 19ms->100ms (5.3X) old vs 5.5ms->17ms (3X) new ```shell time/op GetValidator-8 5.8ms ± 2% 4.7ms ± 1% -17.69% (p=0.000 n=10+10) OneBankSendTxPerBlock-8 3.2ms ± 2% 2.8ms ± 1% -10.80% (p=0.000 n=7+10) OneBankMultiSendTxPerBlock-8 3.1ms ± 3% 2.9ms ± 2% -8.36% (p=0.000 n=10+10) AccountMapperSetAccount-8 8.6µs ± 1% 7.8µs ± 1% -9.74% (p=0.000 n=10+10) CacheKVStoreIterator500-8 64µs ± 6% 51µs ± 6% -19.22% (p=0.000 n=10+9) CacheKVStoreIterator1000-8 0.12ms ± 4% 95µs ± 4% -19.55% (p=0.000 n=10+10) CacheKVStoreIterator10000-8 1.6ms ± 4% 0.90ms ± 1% -42.11% (p=0.000 n=10+10) CacheKVStoreIterator50000-8 19ms ± 5% 5.5ms ± 1% -71.35% (p=0.000 n=10+10) CacheKVStoreIterator100000-8 0.10s ± 23% 17ms ± 7% -83.44% (p=0.000 n=10+10) CacheKVStoreGetNoKeyFound-8 1.3µs ± 6% 0.90µs ± 3% -31.19% (p=0.000 n=9+9) CacheKVStoreGetKeyFound-8 0.66µs ± 6% 0.56µs ± 2% -14.81% (p=0.000 n=10+9) alloc/op B/op BlockProvision-8 0.11kB ± 0% 0.10kB ± 0% -7.14% (p=0.000 n=10+10) CacheKVStoreIterator50000-8 0.89MB ± 6% 0.53MB ± 1% -40.85% (p=0.000 n=10+10) CacheKVStoreIterator100000-8 6.3MB ± 23% 1.6MB ± 6% -74.17% (p=0.000 n=10+10) CacheKVStoreGetNoKeyFound-8 0.26kB ± 0% 0.23kB ± 1% -11.53% (p=0.000 n=10+8) allocs/op (count) AccountMapperSetAccount-8 42 ± 0% 38 ± 0% -9.52% (p=0.000 n=10+10) BlockProvision-8 6.0 ± 0% 5.0 ± 0% -16.67% (p=0.000 n=10+10) CacheKVStoreIterator1000-8 14 ± 0% 13 ± 0% -7.14% (p=0.002 n=8+10) CacheKVStoreIterator10000-8 0.15k ± 2% 76 ± 1% -49.00% (p=0.000 n=7+10) CacheKVStoreIterator50000-8 8.9k ± 11% 2.0k ± 2% -77.60% (p=0.000 n=10+10) CacheKVStoreIterator100000-8 0.10M ± 26% 13k ± 12% -86.89% (p=0.000 n=10+10) CacheKVStoreGetNoKeyFound-8 5.0 ± 0% 4.0 ± 0% -20.00% (p=0.000 n=10+10) ``` Note: Purposefully using a commit off master that doesn't include the buggy code that caused x/bank.BenchmarkOneBank* to fail per issue #10023 Updates #9876
…binary searches (#10024) This change takes the observation that previous dbm.IsKeyInDomain which searches for [start, end) was performing too many byteslice comparisons. Instead we start off by sorting all the values in the store.unsortedCache, and then apply a modified binary search to look for values that fall within the domain [start, end) The procedure involves: * iterating over all items to build a list of all keys -- O(n) * invoking sort.Strings immediately, of which we anyways eventually invoke sort.Slice(unsorted, ...) which uses Quicksort -- O(nlog(n)) or O(n^2) worst case * invoking modified binary search which is O(log(n)) * 2 ~ O(log(n)) to search for the [start, end) range indices for a total approximate complexity of: Best case: O(n) + O(n(log(n))) + O(log(n)) ~= O(nlog(n)) Worst case: O(n) + O(n^2) + O(log(n)) ~= O(n^2) instead of previously: * iterating over all the unsorted items and invoking dbm.IsKeyInDomain: bytes.Compare ~ O(n) + O(n*s*e) where s -- len(start), e -- len(end) for overall complexity of O(n*s*e) * invoking sort.Slice(unsorted, ...) which uses Quicksort -- O(nlog(n)) or O(n^2) worst case for a total approximate complexity of: Best case: O(n) + O(n*s*e) + O(nlog(n)) ~= O(n*s*e) ~ O(n^2) Worst case: O(n) + O(n*s*e) + O(n^2) ~= O(n*s*e) ~ O(n^2) Ordinarily we'd combine the n*s*e to be n*m, but really the comparisons between (start & key, end & key) are profound that it makes sense to keep them as factors. The overall benchmark results vindicate our choice of isolating the factors (n*s*e) The benchmarks show that as the number of keys to iterate grows, the new code grows gracefully in a somewhat linear growth, notice for CAcheKVStoreIterator*, when we go from: * 1,000 to 10,000 keys: 120us->1,600us (13X) old vs 95us->900us (9.47X) new * 50,000 to 100,000 keys: 19ms->100ms (5.3X) old vs 5.5ms->17ms (3X) new ```shell time/op GetValidator-8 5.8ms ± 2% 4.7ms ± 1% -17.69% (p=0.000 n=10+10) OneBankSendTxPerBlock-8 3.2ms ± 2% 2.8ms ± 1% -10.80% (p=0.000 n=7+10) OneBankMultiSendTxPerBlock-8 3.1ms ± 3% 2.9ms ± 2% -8.36% (p=0.000 n=10+10) AccountMapperSetAccount-8 8.6µs ± 1% 7.8µs ± 1% -9.74% (p=0.000 n=10+10) CacheKVStoreIterator500-8 64µs ± 6% 51µs ± 6% -19.22% (p=0.000 n=10+9) CacheKVStoreIterator1000-8 0.12ms ± 4% 95µs ± 4% -19.55% (p=0.000 n=10+10) CacheKVStoreIterator10000-8 1.6ms ± 4% 0.90ms ± 1% -42.11% (p=0.000 n=10+10) CacheKVStoreIterator50000-8 19ms ± 5% 5.5ms ± 1% -71.35% (p=0.000 n=10+10) CacheKVStoreIterator100000-8 0.10s ± 23% 17ms ± 7% -83.44% (p=0.000 n=10+10) CacheKVStoreGetNoKeyFound-8 1.3µs ± 6% 0.90µs ± 3% -31.19% (p=0.000 n=9+9) CacheKVStoreGetKeyFound-8 0.66µs ± 6% 0.56µs ± 2% -14.81% (p=0.000 n=10+9) alloc/op B/op BlockProvision-8 0.11kB ± 0% 0.10kB ± 0% -7.14% (p=0.000 n=10+10) CacheKVStoreIterator50000-8 0.89MB ± 6% 0.53MB ± 1% -40.85% (p=0.000 n=10+10) CacheKVStoreIterator100000-8 6.3MB ± 23% 1.6MB ± 6% -74.17% (p=0.000 n=10+10) CacheKVStoreGetNoKeyFound-8 0.26kB ± 0% 0.23kB ± 1% -11.53% (p=0.000 n=10+8) allocs/op (count) AccountMapperSetAccount-8 42 ± 0% 38 ± 0% -9.52% (p=0.000 n=10+10) BlockProvision-8 6.0 ± 0% 5.0 ± 0% -16.67% (p=0.000 n=10+10) CacheKVStoreIterator1000-8 14 ± 0% 13 ± 0% -7.14% (p=0.002 n=8+10) CacheKVStoreIterator10000-8 0.15k ± 2% 76 ± 1% -49.00% (p=0.000 n=7+10) CacheKVStoreIterator50000-8 8.9k ± 11% 2.0k ± 2% -77.60% (p=0.000 n=10+10) CacheKVStoreIterator100000-8 0.10M ± 26% 13k ± 12% -86.89% (p=0.000 n=10+10) CacheKVStoreGetNoKeyFound-8 5.0 ± 0% 4.0 ± 0% -20.00% (p=0.000 n=10+10) ``` Note: Purposefully using a commit off master that doesn't include the buggy code that caused x/bank.BenchmarkOneBank* to fail per issue #10023 Updates #9876 /cc @cuonglm @kirbyquerby <!-- The default pull request template is for types feat, fix, or refactor. For other templates, add one of the following parameters to the url: - template=docs.md - template=other.md --> ## Description Closes: #XXXX <!-- Add a description of the changes that this PR introduces and the files that are the most critical to review. --> --- ### Author Checklist *All items are required. Please add a note to the item if the item is not applicable and please add links to any relevant follow up issues.* I have... - [x] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title - [x] added `!` to the type prefix if API or client breaking change - [x] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#pr-targeting)) - [x] provided a link to the relevant issue or specification - [x] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/master/docs/building-modules) - [x] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#testing) - [ ] added a changelog entry to `CHANGELOG.md` - [x] included comments for [documenting Go code](https://blog.golang.org/godoc) - [ ] updated the relevant documentation or specification - [ ] reviewed "Files changed" and left comments if necessary - [ ] confirmed all CI checks have passed ### Reviewers Checklist *All items are required. Please add a note if the item is not applicable and please add your handle next to the items reviewed if you only reviewed selected items.* I have... - [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title - [ ] confirmed `!` in the type prefix if API or client breaking change - [ ] confirmed all author checklist items have been addressed - [ ] reviewed state machine logic - [ ] reviewed API design and naming - [ ] reviewed documentation is accurate - [ ] reviewed tests and test coverage - [ ] manually tested (if applicable) (cherry picked from commit 3c85944) # Conflicts: # CHANGELOG.md
…binary searches (backport #10024) (#10370) * fix!: store/cachekv: reduce growth factor for iterator ranging using binary searches (#10024) This change takes the observation that previous dbm.IsKeyInDomain which searches for [start, end) was performing too many byteslice comparisons. Instead we start off by sorting all the values in the store.unsortedCache, and then apply a modified binary search to look for values that fall within the domain [start, end) The procedure involves: * iterating over all items to build a list of all keys -- O(n) * invoking sort.Strings immediately, of which we anyways eventually invoke sort.Slice(unsorted, ...) which uses Quicksort -- O(nlog(n)) or O(n^2) worst case * invoking modified binary search which is O(log(n)) * 2 ~ O(log(n)) to search for the [start, end) range indices for a total approximate complexity of: Best case: O(n) + O(n(log(n))) + O(log(n)) ~= O(nlog(n)) Worst case: O(n) + O(n^2) + O(log(n)) ~= O(n^2) instead of previously: * iterating over all the unsorted items and invoking dbm.IsKeyInDomain: bytes.Compare ~ O(n) + O(n*s*e) where s -- len(start), e -- len(end) for overall complexity of O(n*s*e) * invoking sort.Slice(unsorted, ...) which uses Quicksort -- O(nlog(n)) or O(n^2) worst case for a total approximate complexity of: Best case: O(n) + O(n*s*e) + O(nlog(n)) ~= O(n*s*e) ~ O(n^2) Worst case: O(n) + O(n*s*e) + O(n^2) ~= O(n*s*e) ~ O(n^2) Ordinarily we'd combine the n*s*e to be n*m, but really the comparisons between (start & key, end & key) are profound that it makes sense to keep them as factors. The overall benchmark results vindicate our choice of isolating the factors (n*s*e) The benchmarks show that as the number of keys to iterate grows, the new code grows gracefully in a somewhat linear growth, notice for CAcheKVStoreIterator*, when we go from: * 1,000 to 10,000 keys: 120us->1,600us (13X) old vs 95us->900us (9.47X) new * 50,000 to 100,000 keys: 19ms->100ms (5.3X) old vs 5.5ms->17ms (3X) new ```shell time/op GetValidator-8 5.8ms ± 2% 4.7ms ± 1% -17.69% (p=0.000 n=10+10) OneBankSendTxPerBlock-8 3.2ms ± 2% 2.8ms ± 1% -10.80% (p=0.000 n=7+10) OneBankMultiSendTxPerBlock-8 3.1ms ± 3% 2.9ms ± 2% -8.36% (p=0.000 n=10+10) AccountMapperSetAccount-8 8.6µs ± 1% 7.8µs ± 1% -9.74% (p=0.000 n=10+10) CacheKVStoreIterator500-8 64µs ± 6% 51µs ± 6% -19.22% (p=0.000 n=10+9) CacheKVStoreIterator1000-8 0.12ms ± 4% 95µs ± 4% -19.55% (p=0.000 n=10+10) CacheKVStoreIterator10000-8 1.6ms ± 4% 0.90ms ± 1% -42.11% (p=0.000 n=10+10) CacheKVStoreIterator50000-8 19ms ± 5% 5.5ms ± 1% -71.35% (p=0.000 n=10+10) CacheKVStoreIterator100000-8 0.10s ± 23% 17ms ± 7% -83.44% (p=0.000 n=10+10) CacheKVStoreGetNoKeyFound-8 1.3µs ± 6% 0.90µs ± 3% -31.19% (p=0.000 n=9+9) CacheKVStoreGetKeyFound-8 0.66µs ± 6% 0.56µs ± 2% -14.81% (p=0.000 n=10+9) alloc/op B/op BlockProvision-8 0.11kB ± 0% 0.10kB ± 0% -7.14% (p=0.000 n=10+10) CacheKVStoreIterator50000-8 0.89MB ± 6% 0.53MB ± 1% -40.85% (p=0.000 n=10+10) CacheKVStoreIterator100000-8 6.3MB ± 23% 1.6MB ± 6% -74.17% (p=0.000 n=10+10) CacheKVStoreGetNoKeyFound-8 0.26kB ± 0% 0.23kB ± 1% -11.53% (p=0.000 n=10+8) allocs/op (count) AccountMapperSetAccount-8 42 ± 0% 38 ± 0% -9.52% (p=0.000 n=10+10) BlockProvision-8 6.0 ± 0% 5.0 ± 0% -16.67% (p=0.000 n=10+10) CacheKVStoreIterator1000-8 14 ± 0% 13 ± 0% -7.14% (p=0.002 n=8+10) CacheKVStoreIterator10000-8 0.15k ± 2% 76 ± 1% -49.00% (p=0.000 n=7+10) CacheKVStoreIterator50000-8 8.9k ± 11% 2.0k ± 2% -77.60% (p=0.000 n=10+10) CacheKVStoreIterator100000-8 0.10M ± 26% 13k ± 12% -86.89% (p=0.000 n=10+10) CacheKVStoreGetNoKeyFound-8 5.0 ± 0% 4.0 ± 0% -20.00% (p=0.000 n=10+10) ``` Note: Purposefully using a commit off master that doesn't include the buggy code that caused x/bank.BenchmarkOneBank* to fail per issue #10023 Updates #9876 /cc @cuonglm @kirbyquerby <!-- The default pull request template is for types feat, fix, or refactor. For other templates, add one of the following parameters to the url: - template=docs.md - template=other.md --> ## Description Closes: #XXXX <!-- Add a description of the changes that this PR introduces and the files that are the most critical to review. --> --- ### Author Checklist *All items are required. Please add a note to the item if the item is not applicable and please add links to any relevant follow up issues.* I have... - [x] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title - [x] added `!` to the type prefix if API or client breaking change - [x] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#pr-targeting)) - [x] provided a link to the relevant issue or specification - [x] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/master/docs/building-modules) - [x] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#testing) - [ ] added a changelog entry to `CHANGELOG.md` - [x] included comments for [documenting Go code](https://blog.golang.org/godoc) - [ ] updated the relevant documentation or specification - [ ] reviewed "Files changed" and left comments if necessary - [ ] confirmed all CI checks have passed ### Reviewers Checklist *All items are required. Please add a note if the item is not applicable and please add your handle next to the items reviewed if you only reviewed selected items.* I have... - [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title - [ ] confirmed `!` in the type prefix if API or client breaking change - [ ] confirmed all author checklist items have been addressed - [ ] reviewed state machine logic - [ ] reviewed API design and naming - [ ] reviewed documentation is accurate - [ ] reviewed tests and test coverage - [ ] manually tested (if applicable) (cherry picked from commit 3c85944) # Conflicts: # CHANGELOG.md * fix conflict Co-authored-by: Emmanuel T Odeke <emmanuel@orijtech.com> Co-authored-by: marbar3778 <marbar3778@yahoo.com> Co-authored-by: Robert Zaremba <robert@zaremba.ch>
…binary searches (backport cosmos#10024) (cosmos#10370) * fix!: store/cachekv: reduce growth factor for iterator ranging using binary searches (cosmos#10024) This change takes the observation that previous dbm.IsKeyInDomain which searches for [start, end) was performing too many byteslice comparisons. Instead we start off by sorting all the values in the store.unsortedCache, and then apply a modified binary search to look for values that fall within the domain [start, end) The procedure involves: * iterating over all items to build a list of all keys -- O(n) * invoking sort.Strings immediately, of which we anyways eventually invoke sort.Slice(unsorted, ...) which uses Quicksort -- O(nlog(n)) or O(n^2) worst case * invoking modified binary search which is O(log(n)) * 2 ~ O(log(n)) to search for the [start, end) range indices for a total approximate complexity of: Best case: O(n) + O(n(log(n))) + O(log(n)) ~= O(nlog(n)) Worst case: O(n) + O(n^2) + O(log(n)) ~= O(n^2) instead of previously: * iterating over all the unsorted items and invoking dbm.IsKeyInDomain: bytes.Compare ~ O(n) + O(n*s*e) where s -- len(start), e -- len(end) for overall complexity of O(n*s*e) * invoking sort.Slice(unsorted, ...) which uses Quicksort -- O(nlog(n)) or O(n^2) worst case for a total approximate complexity of: Best case: O(n) + O(n*s*e) + O(nlog(n)) ~= O(n*s*e) ~ O(n^2) Worst case: O(n) + O(n*s*e) + O(n^2) ~= O(n*s*e) ~ O(n^2) Ordinarily we'd combine the n*s*e to be n*m, but really the comparisons between (start & key, end & key) are profound that it makes sense to keep them as factors. The overall benchmark results vindicate our choice of isolating the factors (n*s*e) The benchmarks show that as the number of keys to iterate grows, the new code grows gracefully in a somewhat linear growth, notice for CAcheKVStoreIterator*, when we go from: * 1,000 to 10,000 keys: 120us->1,600us (13X) old vs 95us->900us (9.47X) new * 50,000 to 100,000 keys: 19ms->100ms (5.3X) old vs 5.5ms->17ms (3X) new ```shell time/op GetValidator-8 5.8ms ± 2% 4.7ms ± 1% -17.69% (p=0.000 n=10+10) OneBankSendTxPerBlock-8 3.2ms ± 2% 2.8ms ± 1% -10.80% (p=0.000 n=7+10) OneBankMultiSendTxPerBlock-8 3.1ms ± 3% 2.9ms ± 2% -8.36% (p=0.000 n=10+10) AccountMapperSetAccount-8 8.6µs ± 1% 7.8µs ± 1% -9.74% (p=0.000 n=10+10) CacheKVStoreIterator500-8 64µs ± 6% 51µs ± 6% -19.22% (p=0.000 n=10+9) CacheKVStoreIterator1000-8 0.12ms ± 4% 95µs ± 4% -19.55% (p=0.000 n=10+10) CacheKVStoreIterator10000-8 1.6ms ± 4% 0.90ms ± 1% -42.11% (p=0.000 n=10+10) CacheKVStoreIterator50000-8 19ms ± 5% 5.5ms ± 1% -71.35% (p=0.000 n=10+10) CacheKVStoreIterator100000-8 0.10s ± 23% 17ms ± 7% -83.44% (p=0.000 n=10+10) CacheKVStoreGetNoKeyFound-8 1.3µs ± 6% 0.90µs ± 3% -31.19% (p=0.000 n=9+9) CacheKVStoreGetKeyFound-8 0.66µs ± 6% 0.56µs ± 2% -14.81% (p=0.000 n=10+9) alloc/op B/op BlockProvision-8 0.11kB ± 0% 0.10kB ± 0% -7.14% (p=0.000 n=10+10) CacheKVStoreIterator50000-8 0.89MB ± 6% 0.53MB ± 1% -40.85% (p=0.000 n=10+10) CacheKVStoreIterator100000-8 6.3MB ± 23% 1.6MB ± 6% -74.17% (p=0.000 n=10+10) CacheKVStoreGetNoKeyFound-8 0.26kB ± 0% 0.23kB ± 1% -11.53% (p=0.000 n=10+8) allocs/op (count) AccountMapperSetAccount-8 42 ± 0% 38 ± 0% -9.52% (p=0.000 n=10+10) BlockProvision-8 6.0 ± 0% 5.0 ± 0% -16.67% (p=0.000 n=10+10) CacheKVStoreIterator1000-8 14 ± 0% 13 ± 0% -7.14% (p=0.002 n=8+10) CacheKVStoreIterator10000-8 0.15k ± 2% 76 ± 1% -49.00% (p=0.000 n=7+10) CacheKVStoreIterator50000-8 8.9k ± 11% 2.0k ± 2% -77.60% (p=0.000 n=10+10) CacheKVStoreIterator100000-8 0.10M ± 26% 13k ± 12% -86.89% (p=0.000 n=10+10) CacheKVStoreGetNoKeyFound-8 5.0 ± 0% 4.0 ± 0% -20.00% (p=0.000 n=10+10) ``` Note: Purposefully using a commit off master that doesn't include the buggy code that caused x/bank.BenchmarkOneBank* to fail per issue cosmos#10023 Updates cosmos#9876 /cc @cuonglm @kirbyquerby <!-- The default pull request template is for types feat, fix, or refactor. For other templates, add one of the following parameters to the url: - template=docs.md - template=other.md --> ## Description Closes: #XXXX <!-- Add a description of the changes that this PR introduces and the files that are the most critical to review. --> --- ### Author Checklist *All items are required. Please add a note to the item if the item is not applicable and please add links to any relevant follow up issues.* I have... - [x] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title - [x] added `!` to the type prefix if API or client breaking change - [x] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#pr-targeting)) - [x] provided a link to the relevant issue or specification - [x] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/master/docs/building-modules) - [x] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#testing) - [ ] added a changelog entry to `CHANGELOG.md` - [x] included comments for [documenting Go code](https://blog.golang.org/godoc) - [ ] updated the relevant documentation or specification - [ ] reviewed "Files changed" and left comments if necessary - [ ] confirmed all CI checks have passed ### Reviewers Checklist *All items are required. Please add a note if the item is not applicable and please add your handle next to the items reviewed if you only reviewed selected items.* I have... - [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title - [ ] confirmed `!` in the type prefix if API or client breaking change - [ ] confirmed all author checklist items have been addressed - [ ] reviewed state machine logic - [ ] reviewed API design and naming - [ ] reviewed documentation is accurate - [ ] reviewed tests and test coverage - [ ] manually tested (if applicable) (cherry picked from commit 3c85944) # Conflicts: # CHANGELOG.md * fix conflict Co-authored-by: Emmanuel T Odeke <emmanuel@orijtech.com> Co-authored-by: marbar3778 <marbar3778@yahoo.com> Co-authored-by: Robert Zaremba <robert@zaremba.ch>
…binary searches (backport cosmos#10024) (cosmos#10370) * fix!: store/cachekv: reduce growth factor for iterator ranging using binary searches (cosmos#10024) This change takes the observation that previous dbm.IsKeyInDomain which searches for [start, end) was performing too many byteslice comparisons. Instead we start off by sorting all the values in the store.unsortedCache, and then apply a modified binary search to look for values that fall within the domain [start, end) The procedure involves: * iterating over all items to build a list of all keys -- O(n) * invoking sort.Strings immediately, of which we anyways eventually invoke sort.Slice(unsorted, ...) which uses Quicksort -- O(nlog(n)) or O(n^2) worst case * invoking modified binary search which is O(log(n)) * 2 ~ O(log(n)) to search for the [start, end) range indices for a total approximate complexity of: Best case: O(n) + O(n(log(n))) + O(log(n)) ~= O(nlog(n)) Worst case: O(n) + O(n^2) + O(log(n)) ~= O(n^2) instead of previously: * iterating over all the unsorted items and invoking dbm.IsKeyInDomain: bytes.Compare ~ O(n) + O(n*s*e) where s -- len(start), e -- len(end) for overall complexity of O(n*s*e) * invoking sort.Slice(unsorted, ...) which uses Quicksort -- O(nlog(n)) or O(n^2) worst case for a total approximate complexity of: Best case: O(n) + O(n*s*e) + O(nlog(n)) ~= O(n*s*e) ~ O(n^2) Worst case: O(n) + O(n*s*e) + O(n^2) ~= O(n*s*e) ~ O(n^2) Ordinarily we'd combine the n*s*e to be n*m, but really the comparisons between (start & key, end & key) are profound that it makes sense to keep them as factors. The overall benchmark results vindicate our choice of isolating the factors (n*s*e) The benchmarks show that as the number of keys to iterate grows, the new code grows gracefully in a somewhat linear growth, notice for CAcheKVStoreIterator*, when we go from: * 1,000 to 10,000 keys: 120us->1,600us (13X) old vs 95us->900us (9.47X) new * 50,000 to 100,000 keys: 19ms->100ms (5.3X) old vs 5.5ms->17ms (3X) new ```shell time/op GetValidator-8 5.8ms ± 2% 4.7ms ± 1% -17.69% (p=0.000 n=10+10) OneBankSendTxPerBlock-8 3.2ms ± 2% 2.8ms ± 1% -10.80% (p=0.000 n=7+10) OneBankMultiSendTxPerBlock-8 3.1ms ± 3% 2.9ms ± 2% -8.36% (p=0.000 n=10+10) AccountMapperSetAccount-8 8.6µs ± 1% 7.8µs ± 1% -9.74% (p=0.000 n=10+10) CacheKVStoreIterator500-8 64µs ± 6% 51µs ± 6% -19.22% (p=0.000 n=10+9) CacheKVStoreIterator1000-8 0.12ms ± 4% 95µs ± 4% -19.55% (p=0.000 n=10+10) CacheKVStoreIterator10000-8 1.6ms ± 4% 0.90ms ± 1% -42.11% (p=0.000 n=10+10) CacheKVStoreIterator50000-8 19ms ± 5% 5.5ms ± 1% -71.35% (p=0.000 n=10+10) CacheKVStoreIterator100000-8 0.10s ± 23% 17ms ± 7% -83.44% (p=0.000 n=10+10) CacheKVStoreGetNoKeyFound-8 1.3µs ± 6% 0.90µs ± 3% -31.19% (p=0.000 n=9+9) CacheKVStoreGetKeyFound-8 0.66µs ± 6% 0.56µs ± 2% -14.81% (p=0.000 n=10+9) alloc/op B/op BlockProvision-8 0.11kB ± 0% 0.10kB ± 0% -7.14% (p=0.000 n=10+10) CacheKVStoreIterator50000-8 0.89MB ± 6% 0.53MB ± 1% -40.85% (p=0.000 n=10+10) CacheKVStoreIterator100000-8 6.3MB ± 23% 1.6MB ± 6% -74.17% (p=0.000 n=10+10) CacheKVStoreGetNoKeyFound-8 0.26kB ± 0% 0.23kB ± 1% -11.53% (p=0.000 n=10+8) allocs/op (count) AccountMapperSetAccount-8 42 ± 0% 38 ± 0% -9.52% (p=0.000 n=10+10) BlockProvision-8 6.0 ± 0% 5.0 ± 0% -16.67% (p=0.000 n=10+10) CacheKVStoreIterator1000-8 14 ± 0% 13 ± 0% -7.14% (p=0.002 n=8+10) CacheKVStoreIterator10000-8 0.15k ± 2% 76 ± 1% -49.00% (p=0.000 n=7+10) CacheKVStoreIterator50000-8 8.9k ± 11% 2.0k ± 2% -77.60% (p=0.000 n=10+10) CacheKVStoreIterator100000-8 0.10M ± 26% 13k ± 12% -86.89% (p=0.000 n=10+10) CacheKVStoreGetNoKeyFound-8 5.0 ± 0% 4.0 ± 0% -20.00% (p=0.000 n=10+10) ``` Note: Purposefully using a commit off master that doesn't include the buggy code that caused x/bank.BenchmarkOneBank* to fail per issue cosmos#10023 Updates cosmos#9876 /cc @cuonglm @kirbyquerby <!-- The default pull request template is for types feat, fix, or refactor. For other templates, add one of the following parameters to the url: - template=docs.md - template=other.md --> ## Description Closes: #XXXX <!-- Add a description of the changes that this PR introduces and the files that are the most critical to review. --> --- ### Author Checklist *All items are required. Please add a note to the item if the item is not applicable and please add links to any relevant follow up issues.* I have... - [x] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title - [x] added `!` to the type prefix if API or client breaking change - [x] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#pr-targeting)) - [x] provided a link to the relevant issue or specification - [x] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/master/docs/building-modules) - [x] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#testing) - [ ] added a changelog entry to `CHANGELOG.md` - [x] included comments for [documenting Go code](https://blog.golang.org/godoc) - [ ] updated the relevant documentation or specification - [ ] reviewed "Files changed" and left comments if necessary - [ ] confirmed all CI checks have passed ### Reviewers Checklist *All items are required. Please add a note if the item is not applicable and please add your handle next to the items reviewed if you only reviewed selected items.* I have... - [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title - [ ] confirmed `!` in the type prefix if API or client breaking change - [ ] confirmed all author checklist items have been addressed - [ ] reviewed state machine logic - [ ] reviewed API design and naming - [ ] reviewed documentation is accurate - [ ] reviewed tests and test coverage - [ ] manually tested (if applicable) (cherry picked from commit 3c85944) # Conflicts: # CHANGELOG.md * fix conflict Co-authored-by: Emmanuel T Odeke <emmanuel@orijtech.com> Co-authored-by: marbar3778 <marbar3778@yahoo.com> Co-authored-by: Robert Zaremba <robert@zaremba.ch>
## Describe your changes and provide context Cherry picking changes here: * [\#14168](cosmos/cosmos-sdk#14168) perf: store/cachekv: preallocate kvL in dirtyItems which gets appended too * [\#10024](cosmos/cosmos-sdk#10024) fix!: store/cachekv: reduce growth factor for iterator ranging using binary searches #10024 ## Testing performed to validate your change Ran a cluster ![image](https://user-images.githubusercontent.com/18161326/207636380-1c404827-cdb4-4bdc-aade-bfc0294f6cd4.png)
…binary searches (backport cosmos#10024) (cosmos#10370) * fix!: store/cachekv: reduce growth factor for iterator ranging using binary searches (cosmos#10024) This change takes the observation that previous dbm.IsKeyInDomain which searches for [start, end) was performing too many byteslice comparisons. Instead we start off by sorting all the values in the store.unsortedCache, and then apply a modified binary search to look for values that fall within the domain [start, end) The procedure involves: * iterating over all items to build a list of all keys -- O(n) * invoking sort.Strings immediately, of which we anyways eventually invoke sort.Slice(unsorted, ...) which uses Quicksort -- O(nlog(n)) or O(n^2) worst case * invoking modified binary search which is O(log(n)) * 2 ~ O(log(n)) to search for the [start, end) range indices for a total approximate complexity of: Best case: O(n) + O(n(log(n))) + O(log(n)) ~= O(nlog(n)) Worst case: O(n) + O(n^2) + O(log(n)) ~= O(n^2) instead of previously: * iterating over all the unsorted items and invoking dbm.IsKeyInDomain: bytes.Compare ~ O(n) + O(n*s*e) where s -- len(start), e -- len(end) for overall complexity of O(n*s*e) * invoking sort.Slice(unsorted, ...) which uses Quicksort -- O(nlog(n)) or O(n^2) worst case for a total approximate complexity of: Best case: O(n) + O(n*s*e) + O(nlog(n)) ~= O(n*s*e) ~ O(n^2) Worst case: O(n) + O(n*s*e) + O(n^2) ~= O(n*s*e) ~ O(n^2) Ordinarily we'd combine the n*s*e to be n*m, but really the comparisons between (start & key, end & key) are profound that it makes sense to keep them as factors. The overall benchmark results vindicate our choice of isolating the factors (n*s*e) The benchmarks show that as the number of keys to iterate grows, the new code grows gracefully in a somewhat linear growth, notice for CAcheKVStoreIterator*, when we go from: * 1,000 to 10,000 keys: 120us->1,600us (13X) old vs 95us->900us (9.47X) new * 50,000 to 100,000 keys: 19ms->100ms (5.3X) old vs 5.5ms->17ms (3X) new ```shell time/op GetValidator-8 5.8ms ± 2% 4.7ms ± 1% -17.69% (p=0.000 n=10+10) OneBankSendTxPerBlock-8 3.2ms ± 2% 2.8ms ± 1% -10.80% (p=0.000 n=7+10) OneBankMultiSendTxPerBlock-8 3.1ms ± 3% 2.9ms ± 2% -8.36% (p=0.000 n=10+10) AccountMapperSetAccount-8 8.6µs ± 1% 7.8µs ± 1% -9.74% (p=0.000 n=10+10) CacheKVStoreIterator500-8 64µs ± 6% 51µs ± 6% -19.22% (p=0.000 n=10+9) CacheKVStoreIterator1000-8 0.12ms ± 4% 95µs ± 4% -19.55% (p=0.000 n=10+10) CacheKVStoreIterator10000-8 1.6ms ± 4% 0.90ms ± 1% -42.11% (p=0.000 n=10+10) CacheKVStoreIterator50000-8 19ms ± 5% 5.5ms ± 1% -71.35% (p=0.000 n=10+10) CacheKVStoreIterator100000-8 0.10s ± 23% 17ms ± 7% -83.44% (p=0.000 n=10+10) CacheKVStoreGetNoKeyFound-8 1.3µs ± 6% 0.90µs ± 3% -31.19% (p=0.000 n=9+9) CacheKVStoreGetKeyFound-8 0.66µs ± 6% 0.56µs ± 2% -14.81% (p=0.000 n=10+9) alloc/op B/op BlockProvision-8 0.11kB ± 0% 0.10kB ± 0% -7.14% (p=0.000 n=10+10) CacheKVStoreIterator50000-8 0.89MB ± 6% 0.53MB ± 1% -40.85% (p=0.000 n=10+10) CacheKVStoreIterator100000-8 6.3MB ± 23% 1.6MB ± 6% -74.17% (p=0.000 n=10+10) CacheKVStoreGetNoKeyFound-8 0.26kB ± 0% 0.23kB ± 1% -11.53% (p=0.000 n=10+8) allocs/op (count) AccountMapperSetAccount-8 42 ± 0% 38 ± 0% -9.52% (p=0.000 n=10+10) BlockProvision-8 6.0 ± 0% 5.0 ± 0% -16.67% (p=0.000 n=10+10) CacheKVStoreIterator1000-8 14 ± 0% 13 ± 0% -7.14% (p=0.002 n=8+10) CacheKVStoreIterator10000-8 0.15k ± 2% 76 ± 1% -49.00% (p=0.000 n=7+10) CacheKVStoreIterator50000-8 8.9k ± 11% 2.0k ± 2% -77.60% (p=0.000 n=10+10) CacheKVStoreIterator100000-8 0.10M ± 26% 13k ± 12% -86.89% (p=0.000 n=10+10) CacheKVStoreGetNoKeyFound-8 5.0 ± 0% 4.0 ± 0% -20.00% (p=0.000 n=10+10) ``` Note: Purposefully using a commit off master that doesn't include the buggy code that caused x/bank.BenchmarkOneBank* to fail per issue cosmos#10023 Updates cosmos#9876 /cc @cuonglm @kirbyquerby <!-- The default pull request template is for types feat, fix, or refactor. For other templates, add one of the following parameters to the url: - template=docs.md - template=other.md --> ## Description Closes: #XXXX <!-- Add a description of the changes that this PR introduces and the files that are the most critical to review. --> --- ### Author Checklist *All items are required. Please add a note to the item if the item is not applicable and please add links to any relevant follow up issues.* I have... - [x] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title - [x] added `!` to the type prefix if API or client breaking change - [x] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#pr-targeting)) - [x] provided a link to the relevant issue or specification - [x] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/master/docs/building-modules) - [x] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#testing) - [ ] added a changelog entry to `CHANGELOG.md` - [x] included comments for [documenting Go code](https://blog.golang.org/godoc) - [ ] updated the relevant documentation or specification - [ ] reviewed "Files changed" and left comments if necessary - [ ] confirmed all CI checks have passed ### Reviewers Checklist *All items are required. Please add a note if the item is not applicable and please add your handle next to the items reviewed if you only reviewed selected items.* I have... - [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title - [ ] confirmed `!` in the type prefix if API or client breaking change - [ ] confirmed all author checklist items have been addressed - [ ] reviewed state machine logic - [ ] reviewed API design and naming - [ ] reviewed documentation is accurate - [ ] reviewed tests and test coverage - [ ] manually tested (if applicable) (cherry picked from commit 3c85944) # Conflicts: # CHANGELOG.md * fix conflict Co-authored-by: Emmanuel T Odeke <emmanuel@orijtech.com> Co-authored-by: marbar3778 <marbar3778@yahoo.com> Co-authored-by: Robert Zaremba <robert@zaremba.ch>
This change takes the observation that previous dbm.IsKeyInDomain
which searches for [start, end) was performing too many byteslice
comparisons. Instead we start off by sorting all the values in the
store.unsortedCache, and then apply a modified binary search to
look for values that fall within the domain [start, end)
The procedure involves:
we anyways eventually invoke sort.Slice(unsorted, ...) which uses
Quicksort -- O(nlog(n)) or O(n^2) worst case
to search for the [start, end) range indices
for a total approximate complexity of:
Best case: O(n) + O(n(log(n))) + O(log(n)) ~= O(nlog(n))
Worst case: O(n) + O(n^2) + O(log(n)) ~= O(n^2)
instead of previously:
bytes.Compare ~ O(n) + O(nse) where s -- len(start), e -- len(end)
for overall complexity of O(nse)
Quicksort -- O(nlog(n)) or O(n^2) worst case
for a total approximate complexity of:
Best case: O(n) + O(nse) + O(nlog(n)) ~= O(nse) ~ O(n^2)
Worst case: O(n) + O(nse) + O(n^2) ~= O(nse) ~ O(n^2)
Ordinarily we'd combine the nse to be nm, but really the comparisons
between (start & key, end & key) are profound that it makes sense to
keep them as factors. The overall benchmark results vindicate our choice
of isolating the factors (ns*e)
The benchmarks show that as the number of keys to iterate grows, the
new code grows gracefully in a somewhat linear growth, notice for
CAcheKVStoreIterator*, when we go from:
Note: Purposefully using a commit off master that doesn't
include the buggy code that caused x/bank.BenchmarkOneBank* to fail
per issue #10023
Updates #9876
/cc @cuonglm @kirbyquerby
Description
Closes: #XXXX
Author Checklist
All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.
I have...
!
to the type prefix if API or client breaking changeCHANGELOG.md
Reviewers Checklist
All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.
I have...
!
in the type prefix if API or client breaking change